perf(core): speed up vector sync and tune fastembed defaults #731

phernandez merged 12 commits into main
Conversation
Claude finished @phernandez's task in 3m 14s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

Summary

This is a solid, well-motivated performance PR. The benchmark numbers demonstrate a real speedup (~18% on wall clock, ~22% reduction in embed time), and the code follows the project's literate programming conventions throughout. Below is the full checklist and specific findings.

Code Quality & Standards
Minor: magic number for window size cap

```python
return max(1, min(self._semantic_embedding_sync_batch_size, 8))
```

The `8` would read better as a named constant such as `_SQLITE_VECTOR_PREPARE_WINDOW_MAX = 8`.

Testing & Documentation
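The named-constant suggestion can be sketched as follows. This is a hypothetical illustration, not the PR's actual code: the class, its attribute name, and the cap value are assumptions drawn from the snippet in the review.

```python
# Assumed cap, per the review's suggested constant name.
_SQLITE_VECTOR_PREPARE_WINDOW_MAX = 8


class VectorSyncWindow:
    """Illustrative stand-in for the class owning the window-size logic."""

    def __init__(self, batch_size: int) -> None:
        self._semantic_embedding_sync_batch_size = batch_size

    def _vector_prepare_window_size(self) -> int:
        # Clamp to [1, cap]: a zero or negative batch size still yields a
        # usable window, and an oversized config cannot blow past the cap.
        return max(
            1,
            min(self._semantic_embedding_sync_batch_size, _SQLITE_VECTOR_PREPARE_WINDOW_MAX),
        )
```

The call site then reads as intent ("clamp to the SQLite prepare-window cap") rather than a bare literal.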
Basic Memory Architecture
Potential: telemetry attributes set after the span closes

```python
# The `with telemetry.started_span(...) as batch_span:` block closes above
if batch_span is not None:
    batch_span.set_attributes({...})
```

The `set_attributes` call runs after the span context has exited, so the attributes may never land on the recorded span.

Window-level exception collapse

```python
except Exception as exc:
    return [exc for _ in entity_ids]
```

When the shared read session fails, all entities in the window get the same exception. This is the right fail-safe — we can't know which entities would have been OK — but it is worth a comment noting that a window-level DB failure marks all window entities as failed (which is already visible in the outer warning log).

Security & Performance
Compatibility
Checklist Result
Overall: Approve with minor suggestions. The core batching and FastEmbed tuning work is correct and well-tested. The flagged items above are all minor.
Signed-off-by: phernandez <paul@basicmachines.co>
phernandez force-pushed from 8c75457 to 66eb72d.
Claude finished @phernandez's task in 2m 28s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

Summary

This PR delivers two distinct improvements: (1) batched window reads for the vector-sync prepare phase to eliminate per-entity round-trips on both SQLite and Postgres, and (2) CPU-aware FastEmbed thread/parallel defaults to tune local full-rebuild performance. The benchmark numbers are compelling (~15% wall-clock reduction). Overall the implementation is solid and follows project conventions well. A few items worth addressing before merge:

Code Quality & Standards
| Category | Status | Notes |
|---|---|---|
| Coding conventions (CLAUDE.md) | ✅ | Follows literate style, section headers, decision comments |
| Python 3.12+ type annotations | ✅ | Full coverage |
| SQLAlchemy 2.0 async | ✅ | No sync patterns |
| FastAPI/Typer conventions | ✅ | |
| 100-char line limit | ✅ | |
| No commented-out code | ✅ | |
| Unit tests | ✅ | Extensive additions |
| Integration tests | ✅ | |
| Edge case coverage | ✅ | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ | Appropriate pragma: no cover usage |
| Documentation/docstrings | ✅ | |
| MCP atomic design | n/a | No new MCP tools |
| DB migrations | ✅ | No schema changes; tables created at runtime as before |
| Local-first architecture | ✅ | |
| Knowledge graph consistency | ✅ | |
| No hardcoded secrets | ✅ | |
| Input validation | ✅ | |
| Error handling/logging | ✅ | |
| Performance | ✅ | Core improvement of this PR |
| No sensitive data in logs | ✅ | |
| Windows path compatibility | ✅ | Uses Path objects |
| No emojis/unicode in output | ❌ | ✓ in 2 new console.print lines in db.py |
Required Before Merge

- Replace `✓` with ASCII text in `db.py` lines 276 and 325 (compatibility rule from CLAUDE.md)

Suggested (Non-blocking)

- Extract named constants for `8` in `_vector_prepare_window_size` base and `_resolve_fastembed_runtime_knobs`
- Consolidate the two identical early-exit delete blocks in `_prepare_entity_vector_jobs_prefetched`
- Move the `_effective_parallel()` call into a `runtime_log_attrs()` provider method to avoid the base class reaching into private FastEmbed internals
Signed-off-by: phernandez <paul@basicmachines.co>
phernandez force-pushed from 66eb72d to a68933a.
Claude finished @phernandez's task in 2m 45s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

Summary

This is a solid performance PR delivering batched window reads for the vector-sync prepare phase and CPU-aware FastEmbed thread/parallel defaults. The benchmark numbers (down from ~11:15 to ~9:04 wall clock, embed time from ~625s to ~486s) demonstrate a meaningful real-world improvement.

What was fixed since previous reviews

- ✅ Unicode output replaced with ASCII

Code Quality & Standards
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Excellent literate-style comments throughout |
| Python 3.12+ type annotations | ✅ Pass | |
| SQLAlchemy 2.0 async | ✅ Pass | |
| FastAPI/Typer conventions | ✅ Pass | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | Extensive additions |
| Integration tests | ✅ Pass | force_full path validated |
| Edge case coverage | ✅ Pass | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ Pass | Appropriate pragma: no cover usage |
| No MCP tool changes | ✅ Pass | |
| DB migrations | ✅ Pass | No schema changes; vector tables recreated at runtime |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | Fixed: ✓ → done |
| Windows path compatibility | ✅ Pass | |
Minor suggestions (non-blocking):

| Item | Notes |
|---|---|
| `started_span` timing | `set_attributes` after span context exits |
| Magic number `8` | Two instances — could be named constants |
| Duplicated delete blocks | `_prepare_entity_vector_jobs_prefetched` lines 1267–1297 |
| Private `_effective_parallel()` | Base class reaches into provider internals |
Overall: Approve. The blocking compatibility issue from previous reviews (unicode ✓) has been fixed. All remaining items are minor non-blocking suggestions. The core batching and FastEmbed tuning work is correct, well-tested, and follows project conventions throughout.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 1m 58s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

Summary

All suggestions from previous review rounds have been addressed.

What was fixed in the latest commit
Final Checklist
Overall: Approve. The implementation is correct, well-tested, and follows project conventions throughout. No remaining issues.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 3m 11s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Trigger/Why/Outcome comment present |
| Python 3.12+ type annotations | ✅ Pass | No new code |
| SQLAlchemy patterns | n/a | Config-only change |
| Unit tests | ✅ Pass | No test updates needed; existing tests use explicit values |
| Integration tests | ✅ Pass | |
| 100% coverage | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | Already fixed in prior commit |
| Performance | ✅ Pass | Benchmark-validated |
| All prior review items | ✅ Pass | All addressed in d59b974 |
Overall: Approve. This is a clean, minimal, benchmark-backed config tuning on top of an already-approved implementation. All previous review findings remain resolved.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 2m 17s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Trigger/Why/Outcome on every changed branch |
| Python 3.12+ type annotations | ✅ Pass | |
| SQLAlchemy 2.0 async | n/a | |
| FastAPI/Typer conventions | n/a | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | Regression test covers the CRLF divergence case |
| Integration tests | ✅ Pass | Existing write tests cover the write path |
| Edge case coverage | ✅ Pass | Monkeypatched CRLF write simulates the actual failure |
| 100% test coverage | ✅ Pass | pragma: no cover used appropriately |
| DB migrations | n/a | No schema changes |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | Already fixed in prior commit |
| Windows path compatibility | ✅ Pass | This commit specifically resolves the Windows checksum bug |
Minor (non-blocking):

| Item | Notes |
|---|---|
| Empty `if` body in write path | `pass` is correct but could use an explanatory comment |
Overall: Approve. This is a clean, targeted, correctly scoped fix. The dual-layer approach (prevent CRLF at the write source + always hash from disk) is defensive in depth and the right design. All previous review findings remain resolved.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 2m 38s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in b54c5f2
| Fix | 0434d9c | b54c5f2 |
|---|---|---|
| Force LF via `newline="\n"` in `write_file_atomic` and `format_markdown_builtin` | ✅ Added | ❌ Reverted |
| Compute checksum from disk bytes in `write_file` and `update_frontmatter_with_result` | ✅ Added | ✅ Kept |
The newline="\n" override is gone from both write paths, allowing the platform to use its native line endings (CRLF on Windows). The disk-based checksum approach remains.
Analysis
✅ Functional correctness is maintained
I traced the checksum comparison paths to confirm there is no regression:
- Write path: `write_file` → binary `compute_checksum(path)` → stored in DB
- Sync comparison: `file_service.compute_checksum(path)` (binary `rb` mode) → consistent
- Initial entity load: `read_file_bytes` + `compute_checksum(bytes)` → consistent
- `read_file`: computes checksum from a text-mode string, but callers at `entity_service.py:627` and `entity_service.py:1127` discard the checksum (`_`), so this is not in the sync comparison path
The disk-read checksum is the single authoritative source for sync comparisons. Removing newline="\n" allows CRLF on Windows, but since the checksum is always computed from the actual on-disk bytes, subsequent reads will always agree with what was stored.
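The write-path strategy described above can be sketched minimally. `compute_checksum` and `write_file` here are illustrative stand-ins for the project's helpers, and SHA-256 is an assumption (the thread does not name the hash algorithm):

```python
import hashlib
from pathlib import Path


def compute_checksum(path: Path) -> str:
    # Binary read: the hash reflects the exact bytes on disk, so platform
    # newline translation can never cause drift between write and re-read.
    return hashlib.sha256(path.read_bytes()).hexdigest()


def write_file(path: Path, content: str) -> str:
    # Text-mode write may translate "\n" to "\r\n" on Windows; that is fine
    # because the stored checksum is computed from whatever actually landed
    # on disk, not from the pre-write Python string.
    path.write_text(content, encoding="utf-8")
    return compute_checksum(path)
```

A later `compute_checksum(path)` call always agrees with the stored value, regardless of which line endings the platform wrote.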
✅ CRLF regression test still present
tests/services/test_file_service.py:172 — test_update_frontmatter_checksum_matches_persisted_bytes is unchanged and still validates that FileService hashes the stored file, not the pre-write string. This test covers the root scenario that motivated 0434d9c.
⚠️ Missing decision-point comment (Minor — CLAUDE.md violation)
file_utils.py:117-118:
```python
# Use aiofiles for non-blocking write
async with aiofiles.open(temp_path, mode="w", encoding="utf-8") as f:
```

The original Trigger/Why/Outcome comment from 0434d9c was removed and replaced with a generic "what" comment that has nothing to do with the decision being made here. The comment describes aiofiles (obvious) but says nothing about why `newline="\n"` was removed. Per CLAUDE.md, decision-point comments must explain:
- Trigger: when does this apply?
- Why: the rationale
- Outcome: what changes downstream
Something like:
```python
# Trigger: caller passes normalized Python strings (LF-only); write_file / update_frontmatter
# already compute the checksum from on-disk bytes, so platform newline translation
# does not cause checksum drift.
# Why: preserving platform-native newlines (CRLF on Windows) keeps files compatible
# with Obsidian and other local editors that expect the host OS line ending.
# Outcome: file bytes may differ across platforms; downstream consumers use
# FileService.compute_checksum(path) (binary mode) to stay consistent.
async with aiofiles.open(temp_path, mode="w", encoding="utf-8") as f:
```

Checklist
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ⚠️ Minor | Decision-point comment missing in `write_file_atomic` |
| Python 3.12+ type annotations | ✅ Pass | No new code |
| SQLAlchemy patterns | n/a | Utility change |
| Unit tests | ✅ Pass | CRLF regression test retained and still valid |
| Integration tests | ✅ Pass | |
| 100% coverage | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | Already fixed in prior commit |
| Windows path compatibility | ✅ Pass | Disk-based checksum keeps sync consistent across platforms |
| All prior review items | ✅ Pass | All addressed in d59b974 |
Overall: Approve with minor suggestion. The functional change is correct — the disk-based checksum approach is sufficient to maintain sync consistency without forcing LF, and the CRLF regression test is still in place. The only item worth fixing before merge is the missing Trigger/Why/Outcome comment in write_file_atomic, which doesn't explain the reasoning behind removing newline="\n".
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 1m 7s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Literate-style throughout; all decision-point comments present |
| Python 3.12+ type annotations | ✅ Pass | |
| SQLAlchemy 2.0 async | ✅ Pass | |
| FastAPI/Typer conventions | ✅ Pass | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | +322 lines CLI, +334 lines semantic base, +158 lines SQLite vector |
| Integration tests | ✅ Pass | force_full path validated |
| Edge case coverage | ✅ Pass | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ Pass | # pragma: no cover used appropriately |
| No MCP tool changes | ✅ Pass | |
| DB migrations | ✅ Pass | No schema changes; vector tables recreated at runtime |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | ✓ → done (fixed in prior commit) |
| Windows path compatibility | ✅ Pass | Disk-based checksum resolves cross-platform drift |
| `started_span` timing | ✅ Pass | `set_attributes` inside span context (fixed in d59b974) |
| Magic number `8` | ✅ Pass | `_SQLITE_MAX_PREPARE_WINDOW` and `_FASTEMBED_MAX_THREADS` |
| Duplicate delete blocks | ✅ Pass | _delete_entity_chunks_and_finish() helper |
| Provider abstraction boundary | ✅ Pass | runtime_log_attrs() on Protocol |
| Decision-point comment (write_file_atomic) | ✅ Pass | Trigger/Why/Outcome added in this commit |
| Decision-point comment (format_markdown_builtin) | ✅ Pass | Added proactively in this commit |
Overall: Approve. Every item raised across all review rounds has been resolved. The PR delivers a real, benchmark-validated performance improvement with a clean implementation, comprehensive tests, and thorough documentation. Ready to merge.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 1m 0s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Change | Before | After |
|---|---|---|
| Test name | `test_update_frontmatter_checksum_matches_persisted_bytes` | `test_update_frontmatter_checksum_matches_windows_crlf_persisted_bytes` |
| Docstring | "Frontmatter writes should hash the stored file, not the pre-write string." | "Windows-style CRLF writes should hash the stored file, not the pre-write string." |
| Trigger comment | "Trigger: simulate a writer that persists CRLF bytes like Windows text mode." | "Trigger: simulate Windows text-mode persistence, where logical LF strings land on disk as CRLF bytes." |
Analysis
✅ Naming and clarity improvements
All three changes make the test's intent more precise. The rename makes the platform-specific regression scenario explicit in the test name itself, so developers reading a failing test output or coverage report immediately know what scenario is covered. The expanded trigger comment follows CLAUDE.md's literate style more closely than the prior one-liner.
✅ No regressions introduced
Test-only commit with zero logic changes. All previously passing review items remain intact.
✅ Trigger/Why/Outcome format preserved
The expanded trigger comment still follows the Trigger/Why/Outcome structure exactly. The Why and Outcome lines are unchanged from 0434d9c; only the Trigger line was made more descriptive.
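A test along the lines described could be sketched like this, with hypothetical helper names; the real regression test lives in tests/services/test_file_service.py:

```python
import hashlib
from pathlib import Path


def checksum_of(path: Path) -> str:
    # Binary-mode hash of the persisted bytes.
    return hashlib.sha256(path.read_bytes()).hexdigest()


def crlf_write(path: Path, content: str) -> None:
    # Trigger: simulate Windows text-mode persistence, where logical LF
    # strings land on disk as CRLF bytes.
    path.write_bytes(content.replace("\n", "\r\n").encode("utf-8"))


def update_frontmatter(path: Path, content: str) -> str:
    crlf_write(path, content)
    # Hash the stored file, not the pre-write string.
    return checksum_of(path)
```

The key assertion is that the recorded checksum equals a fresh hash of the on-disk bytes, and differs from a hash of the pre-write LF string.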
Final Checklist (all items across all review rounds)
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Literate-style throughout; all decision-point comments present |
| Python 3.12+ type annotations | ✅ Pass | |
| SQLAlchemy 2.0 async | ✅ Pass | |
| FastAPI/Typer conventions | ✅ Pass | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | +322 lines CLI, +334 lines semantic base, +158 lines SQLite vector |
| Integration tests | ✅ Pass | force_full path validated |
| Edge case coverage | ✅ Pass | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ Pass | # pragma: no cover used appropriately |
| No MCP tool changes | ✅ Pass | |
| DB migrations | ✅ Pass | No schema changes; vector tables recreated at runtime |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | ✓ → done (fixed in prior commit) |
| Windows path compatibility | ✅ Pass | Disk-based checksum resolves cross-platform drift |
| `started_span` timing | ✅ Pass | `set_attributes` inside span context (fixed in d59b974) |
| Magic number `8` | ✅ Pass | `_SQLITE_MAX_PREPARE_WINDOW` and `_FASTEMBED_MAX_THREADS` |
| Duplicate delete blocks | ✅ Pass | _delete_entity_chunks_and_finish() helper |
| Provider abstraction boundary | ✅ Pass | runtime_log_attrs() on Protocol |
| Decision-point comment (write_file_atomic) | ✅ Pass | Trigger/Why/Outcome added in 3f23b3e |
| Decision-point comment (format_markdown_builtin) | ✅ Pass | Added in 3f23b3e |
| Windows CRLF regression test clarity | ✅ Pass | Test name and trigger comment sharpened in this commit |
Overall: Approve. This is a pure test-clarity commit with no logic changes. The renamed test and expanded trigger comment make the Windows CRLF regression scenario unambiguous. Every item raised across all review rounds remains resolved. Ready to merge.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 2m 32s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Trigger/Why/Outcome comment present and correct |
| Python 3.12+ type annotations | ✅ Pass | No new code |
| SQLAlchemy 2.0 async | n/a | |
| FastAPI/Typer conventions | n/a | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | +322 lines CLI, +334 lines semantic base, +158 lines SQLite vector |
| Integration tests | ✅ Pass | force_full path validated |
| Batch indexer test platform safety | ✅ Pass | read_text() replaces LF literals in 2 assertions |
| Edge case coverage | ✅ Pass | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ Pass | # pragma: no cover used appropriately |
| No MCP tool changes | ✅ Pass | |
| DB migrations | ✅ Pass | No schema changes |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | ✓ → done (fixed in prior commit) |
| Windows path compatibility | ✅ Pass | Disk-based checksum resolves cross-platform drift |
| `read_file` checksum consistency | ✅ Pass | Now uses `self.compute_checksum(path)` like all other FileService methods |
| `started_span` timing | ✅ Pass | Fixed in d59b974 |
| Magic number `8` | ✅ Pass | `_SQLITE_MAX_PREPARE_WINDOW` and `_FASTEMBED_MAX_THREADS` |
| Duplicate delete blocks | ✅ Pass | _delete_entity_chunks_and_finish() helper |
| Provider abstraction boundary | ✅ Pass | runtime_log_attrs() on Protocol |
| Decision-point comments | ✅ Pass | All present and complete across all changed files |
Overall: Approve. This commit completes the disk-based checksum strategy by closing the last remaining read_file gap. The change is minimal, correctly scoped, and consistent with the approach established across this PR. All items raised across all review rounds remain resolved. Ready to merge.
Signed-off-by: phernandez <paul@basicmachines.co>
Claude finished @phernandez's task in 4m 17s — View job

PR Review: perf(core): speed up vector sync and tune fastembed defaults

What changed in this commit
| Area | Status | Notes |
|---|---|---|
| Code conventions (CLAUDE.md) | ✅ Pass | Literate-style throughout; Trigger/Why/Outcome on all decision branches |
| Python 3.12+ type annotations | ✅ Pass | |
| SQLAlchemy 2.0 async | ✅ Pass | |
| FastAPI/Typer conventions | ✅ Pass | |
| 100-char line limit | ✅ Pass | |
| No commented-out code | ✅ Pass | |
| Unit tests | ✅ Pass | +322 lines CLI, +334 lines semantic base, +158 lines SQLite vector |
| Integration tests | ✅ Pass | force_full path validated |
| Edge case coverage | ✅ Pass | Window errors, oversized entities, orphan runtime |
| 100% test coverage | ✅ Pass | # pragma: no cover used appropriately |
| No MCP tool changes | ✅ Pass | |
| DB migrations | ✅ Pass | No schema changes; vector tables recreated at runtime |
| Local-first architecture | ✅ Pass | |
| No hardcoded secrets | ✅ Pass | |
| Input validation | ✅ Pass | |
| No unicode/emoji in output | ✅ Pass | ✓ → done (fixed in prior commit) |
| Windows path compatibility | ✅ Pass | Disk-based checksum resolves cross-platform drift |
| `started_span` timing | ✅ Pass | Fixed in d59b974 |
| Magic number `8` | ✅ Pass | `_SQLITE_MAX_PREPARE_WINDOW` and `_FASTEMBED_MAX_THREADS` |
| Duplicate delete blocks | ✅ Pass | _delete_entity_chunks_and_finish() helper |
| Provider abstraction boundary | ✅ Pass | runtime_log_attrs() on Protocol |
| Decision-point comments | ✅ Pass | All present and complete across all changed files |
| Windows CRLF assertion accuracy | ✅ Pass | read_bytes().decode("utf-8") matches binary-decoded prepared.content exactly |
Overall: Approve. This commit fixes a subtle Windows test divergence from 7421cfc: read_text() normalizes CRLF→LF, which would diverge from prepared.content (binary-decoded, CRLF preserved) on Windows. The switch to read_bytes().decode("utf-8") makes both sides of the assertion follow the same decoding path. On Linux the two are equivalent, so CI stays green. All items raised across all review rounds remain resolved. Ready to merge.
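The decoding difference behind this fix is easy to demonstrate with the standard library: `Path.read_text()` opens the file in universal-newline mode and folds CRLF to LF, while `read_bytes().decode("utf-8")` preserves the stored bytes exactly.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as d:
    p = Path(d) / "note.md"
    p.write_bytes(b"title\r\nbody\r\n")  # Windows-style CRLF on disk

    normalized = p.read_text(encoding="utf-8")  # universal newlines: CRLF folded to LF
    exact = p.read_bytes().decode("utf-8")      # bytes preserved verbatim
```

On Linux the two paths return identical strings for LF files, which is why the divergence only surfaced on Windows.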
Summary
- `bm reindex --full`: match cloud-style full rebuild semantics more closely for search and embeddings
- Default to `parallel=1`, with runtime settings logged at batch start

Validation

- `just fast-check`
- `uv run bm reindex --project main --full`

Benchmark notes

- Baseline: about 11:15 total, `embed_seconds_total=625.598`
- `threads=8, parallel=1`: about 9:55 total, `embed_seconds_total=546.204`
- `threads=6, parallel=1`: about 9:10 total in one explicit run, `embed_seconds_total=497.861`
- `null`: 9:04.46 total, `embed_seconds_total=486.343`, `queue_wait_seconds_total=2717.249`
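A CPU-aware resolver in the spirit of this PR might look like the sketch below. The function name, the cap value, and the `parallel=1` choice are assumptions based on the constants and benchmark rows mentioned in the reviews, not the actual implementation:

```python
import os

# Assumed cap; the PR names a constant _FASTEMBED_MAX_THREADS, but its
# exact value is not shown in this thread.
_FASTEMBED_MAX_THREADS = 8


def resolve_fastembed_runtime_knobs() -> tuple[int, int]:
    # CPU-aware default: scale threads with the host, capped so large
    # machines do not oversubscribe the embedding runtime.
    cpu = os.cpu_count() or 1
    threads = max(1, min(cpu, _FASTEMBED_MAX_THREADS))
    # Single embedding session; intra-op threads do the parallel work.
    parallel = 1
    return threads, parallel
```

Logging the resolved pair at batch start, as the PR does, makes benchmark runs like the ones above directly comparable.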